{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1-D convolution for text analytics\n",
    "- It is commonly considered that RNNs are for text analysis and CNNs are for image analysis. However, It could be the other way around as well.\n",
    "- With help of features such as local connectivity, sliding filters, weight sharing, etc., CNNs can be attractive for text analysis as.\n",
    "- Keras has ```Conv1D``` layer, which is similar to ```Conv2D```\n",
    "    - ```Conv1D``` is attractive substitute for recurrent models in learning context-based data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from keras.models import *\n",
    "from keras.layers import *\n",
    "from keras.datasets import reuters\n",
    "from keras.preprocessing import sequence\n",
    "from keras.utils import to_categorical\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ```Conv1D``` layer\n",
    "- Temporal convolution - \"creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs\"\n",
    "    - input: 3D tensor of shape ```(batch_size, steps, input_dim)```\n",
    "    - output: 3D tensor of shape ```(batch_size, new_steps, filters)```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(?, 8, 5)\n"
     ]
    }
   ],
   "source": [
    "# setting num_filters = 5 and kernel_size = 3\n",
    "# 0-10 timesteps & kernel_size = 3 => 8 new steps\n",
    "conv1d = Conv1D(5, 3, padding = 'valid')(Input(shape = (10, 30)))\n",
    "print(conv1d.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(?, 10, 5)\n"
     ]
    }
   ],
   "source": [
    "# when padding = 'same'\n",
    "conv1d = Conv1D(5, 3, padding = 'same')(Input(shape = (10, 30)))\n",
    "print(conv1d.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(?, 3, 2, 5)\n",
      "(?, 5, 4, 5)\n"
     ]
    }
   ],
   "source": [
    "# comparison to Conv2D layer\n",
    "conv2d = Conv2D(5, (3,3), padding = 'valid')(Input(shape = (5, 4, 3)))\n",
    "print(conv2d.shape)\n",
    "conv2d = Conv2D(5, (3,3), padding = 'same')(Input(shape = (5, 4, 3)))\n",
    "print(conv2d.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using ```Conv1D``` layer in Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# parameters to import dataset\n",
    "def get_reuters(num_words = 3000, maxlen = 50):    \n",
    "    (X_train, y_train), (X_test, y_test) = reuters.load_data(num_words = num_words, maxlen = maxlen)\n",
    "\n",
    "    X_train = sequence.pad_sequences(X_train, maxlen = maxlen, padding = 'post')\n",
    "    X_test = sequence.pad_sequences(X_test, maxlen = maxlen, padding = 'post')\n",
    "    y_train = to_categorical(y_train, num_classes = 46)\n",
    "    y_test = to_categorical(y_test, num_classes = 46)\n",
    "\n",
    "    print(X_train.shape)\n",
    "    print(X_test.shape)\n",
    "    print(y_train.shape)\n",
    "    print(y_test.shape)\n",
    "    \n",
    "    return X_train, X_test, y_train, y_test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "num_words = 3000\n",
    "max_len = 50\n",
    "embed_size = 100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1595, 50)\n",
      "(399, 50)\n",
      "(1595, 46)\n",
      "(399, 46)\n"
     ]
    }
   ],
   "source": [
    "X_train, X_test, y_train, y_test = get_reuters(num_words, max_len)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def one_dim_convolution_model(num_words, embed_size, input_length):\n",
    "    model = Sequential()\n",
    "    model.add(Embedding(num_words, embed_size, input_length = max_len))\n",
    "    model.add(Conv1D(50, 10, activation = 'relu'))\n",
    "    model.add(MaxPooling1D(10))\n",
    "    model.add(GlobalMaxPooling1D())\n",
    "    model.add(Dense(46, activation = 'softmax'))\n",
    "    model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['acc'])\n",
    "    return model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_________________________________________________________________\n",
      "Layer (type)                 Output Shape              Param #   \n",
      "=================================================================\n",
      "embedding_3 (Embedding)      (None, 50, 100)           300000    \n",
      "_________________________________________________________________\n",
      "conv1d_12 (Conv1D)           (None, 41, 50)            50050     \n",
      "_________________________________________________________________\n",
      "max_pooling1d_3 (MaxPooling1 (None, 4, 50)             0         \n",
      "_________________________________________________________________\n",
      "global_max_pooling1d_3 (Glob (None, 50)                0         \n",
      "_________________________________________________________________\n",
      "dense_3 (Dense)              (None, 46)                2346      \n",
      "=================================================================\n",
      "Total params: 352,396\n",
      "Trainable params: 352,396\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "model = one_dim_convolution_model(num_words, embed_size, max_len)\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "history = model.fit(X_train, y_train, epochs = 100, batch_size = 100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\r",
      " 32/399 [=>............................] - ETA: 0s"
     ]
    }
   ],
   "source": [
    "result = model.evaluate(X_test, y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test Accuracy:  0.894736842553\n"
     ]
    }
   ],
   "source": [
    "print('Test Accuracy: ', result[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Diversifying Size of Kernels\n",
    "- In the previous example, size of kernels were kept constant. However, it is also possible to perform convolution operations with different size of kernels parallel, and merge them afterwards\n",
    "    - This could be done using ```Functional API``` when creating model\n",
    "\n",
    "<br>\n",
    "<img src=\"http://www.wildml.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-06-at-8.03.47-AM.png\" style=\"width: 600px\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1595, 50)\n",
      "(399, 50)\n",
      "(1595, 46)\n",
      "(399, 46)\n"
     ]
    }
   ],
   "source": [
    "num_words = 3000\n",
    "max_len = 50\n",
    "embed_size = 100\n",
    "kernel_sizes = 5, 10, 15\n",
    "\n",
    "X_train, X_test, y_train, y_test = get_reuters(num_words, max_len)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def one_dim_convolution_model_with_diff_kernels(num_words, embed_size, input_length, kernel_sizes):\n",
    "    inputs = Input(shape = (X_train.shape[1],))\n",
    "    embedded = Embedding(output_dim = embed_size, input_dim = num_words, input_length = max_len)(inputs)\n",
    "    conv_results = []\n",
    "    for kernel_size in kernel_sizes:\n",
    "        x = Conv1D(50, kernel_size, activation = 'relu')(embedded)\n",
    "        x = MaxPooling1D(pool_size = max_len - kernel_size + 1)(x)\n",
    "        conv_results.append(x)\n",
    "    conv_result = concatenate(conv_results)\n",
    "    x = GlobalMaxPooling1D()(conv_result)\n",
    "    outputs = Dense(46, activation = 'softmax')(x)\n",
    "    model = Model(inputs = inputs, outputs = outputs)\n",
    "    model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['acc'])\n",
    "    return model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "__________________________________________________________________________________________________\n",
      "Layer (type)                    Output Shape         Param #     Connected to                     \n",
      "==================================================================================================\n",
      "input_3 (InputLayer)            (None, 50)           0                                            \n",
      "__________________________________________________________________________________________________\n",
      "embedding_3 (Embedding)         (None, 50, 100)      300000      input_3[0][0]                    \n",
      "__________________________________________________________________________________________________\n",
      "conv1d_5 (Conv1D)               (None, 46, 50)       25050       embedding_3[0][0]                \n",
      "__________________________________________________________________________________________________\n",
      "conv1d_6 (Conv1D)               (None, 41, 50)       50050       embedding_3[0][0]                \n",
      "__________________________________________________________________________________________________\n",
      "conv1d_7 (Conv1D)               (None, 36, 50)       75050       embedding_3[0][0]                \n",
      "__________________________________________________________________________________________________\n",
      "max_pooling1d_4 (MaxPooling1D)  (None, 1, 50)        0           conv1d_5[0][0]                   \n",
      "__________________________________________________________________________________________________\n",
      "max_pooling1d_5 (MaxPooling1D)  (None, 1, 50)        0           conv1d_6[0][0]                   \n",
      "__________________________________________________________________________________________________\n",
      "max_pooling1d_6 (MaxPooling1D)  (None, 1, 50)        0           conv1d_7[0][0]                   \n",
      "__________________________________________________________________________________________________\n",
      "concatenate_2 (Concatenate)     (None, 1, 150)       0           max_pooling1d_4[0][0]            \n",
      "                                                                 max_pooling1d_5[0][0]            \n",
      "                                                                 max_pooling1d_6[0][0]            \n",
      "__________________________________________________________________________________________________\n",
      "global_max_pooling1d_2 (GlobalM (None, 150)          0           concatenate_2[0][0]              \n",
      "__________________________________________________________________________________________________\n",
      "dense_2 (Dense)                 (None, 46)           6946        global_max_pooling1d_2[0][0]     \n",
      "==================================================================================================\n",
      "Total params: 457,096\n",
      "Trainable params: 457,096\n",
      "Non-trainable params: 0\n",
      "__________________________________________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "model = one_dim_convolution_model_with_diff_kernels(num_words, embed_size, max_len, kernel_sizes)\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "history = model.fit(X_train, y_train, epochs = 100, batch_size = 100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "399/399 [==============================] - 0s     \n"
     ]
    }
   ],
   "source": [
    "result = model.evaluate(X_test, y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test Accuracy:  0.897243108218\n"
     ]
    }
   ],
   "source": [
    "print('Test Accuracy: ', result[1])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
